1. Introduction

Recent experimental progress in quantum information processing has highlighted the 
importance of quantum control and in particular, the task of system identification. 
For example, it is essential for the verification of quantum gates to be able to effectively 
identify quantum processes. This identification might require a full process tomography, 
but quite often an estimation of a number of free parameters may be sufficient to 
determine the proper operation of a gate. In such cases, it is obviously beneficial to 
use optimal schemes to estimate the free parameters. With that motivation we consider 
the problem of optimally estimating a quantum process with one free parameter. Figure
1 represents a schematic of the problem. The goal is to identify an input state and a
measurement scheme that will permit one to gain the most information about the free
parameter. We will make these notions more precise in what follows. 

The field of quantum parameter estimation is concerned with the methods of 
estimating - especially optimally estimating - properties of quantum states or processes. 
The field has a fairly short but rich history, beginning with the pioneering work of 
Helstrom [1] and Holevo [2], and has recently seen a rise in interest, especially from the
quantum information community. 

The task of estimating quantum states has an enormous literature dedicated to
it (a small sample is Refs. [?], and for a recent review see Ref. [?]). There are
several notions of optimality in this scenario and some of them highlight the essential
connection between the optimal estimation problem and the geometry of the space of
quantum states [?]: the statistical distinguishability of states induces a unique metric
(the Fisher metric) on the space of quantum states [?]. This is a pleasing state of
affairs which, although not of much practical use (it cannot be used to identify optimal
estimation strategies except in some special cases [?]), is tremendously edifying.

In contrast, parameter estimation for quantum processes is an area that is far 
less developed. The estimation of unitary quantum processes has been examined by a 
number of authors over the past few decades [?]; in these treatments, the
system evolves unitarily and the parameters to be estimated are unknown parameters of 
this unitary. However, the general form of the problem, where the evolution is a general 
quantum operation, has received considerably less attention. Some recent treatments of 
special cases of this more general problem are Refs. [?].

One way to attack the problem of optimal quantum process estimation is to treat 
it as an optimal state estimation problem where the states under consideration are 
restricted to the parametrized (by the parameters of the process) set of output states 
of the quantum process. This is a practical point of view which recognizes that the
only operational access to the parameters of the process is through probe input states
that are measured at the output of the process (see Fig. 1). This approach breaks the
process parameter estimation into two parts: the choice of an optimal input state, and
the choice of an optimal estimation scheme for the parametric family of output states
$\{\mathcal{E}_\theta(\rho_0)\}_\theta$. Such an approach has been taken in the past (e.g. [12]) but with the dynamics
of the process (or equivalently, the dependency of the output parametric family on
the parameters of the process) being represented by rather abstract superoperators
such as the symmetric logarithmic derivative [?]. The disadvantage of using such a
representation is that explicit expressions for such superoperators are often difficult to
calculate.

The aim of this paper is to use the same approach to the estimation of quantum 
processes, but to use a more common representation for them. In particular, we assume 
that the Kraus representation (operator sum decomposition) of the quantum process is 
given, with the parameters to be estimated being free parameters of the Kraus operators. 
We consider the simplest quantum process estimation problem in this general setting: 
the estimation of a one parameter, trace-preserving quantum operation; and investigate 
the advantages and disadvantages of using the Kraus representation to describe the 
process. We assume that the input state remains fixed and optimize over the estimation 
scheme to arrive at a measure of optimality. The measure is non-unique and non- 
constructive (it does not allow one to deduce the optimal POVM), primarily due to the 
non-uniqueness of the Kraus representation. However, for a special family of quantum 
channels we show that it can be used to test the optimality of an estimation scheme. 
We also discuss the geometry of the problem and apply the results to several examples, 
including one which illustrates the value of using entanglement to increase the statistical
distinguishability of quantum processes.

The paper is organized as follows: section 2 covers some preliminary concepts and
sets the notation. Section 3 derives the optimality conditions for the estimation scheme
and discusses them. Then we consider several examples in section 4 and finally conclude
in section 5.

Figure 1. The estimation procedure



2. Preliminaries 



2.1. Quantum operations 

Given two Hilbert spaces, $\mathcal{H}_1$ and $\mathcal{H}_2$, with dimensions $d_1$ and $d_2$ respectively, a quantum
operation $\mathcal{E}: \mathcal{H}_1^t \to \mathcal{H}_2^t$ is a completely positive linear map between trace-class
operators on $\mathcal{H}_1$ and $\mathcal{H}_2$ ($\mathcal{H}_i^t$ is the space of trace-class operators acting on the Hilbert space
$\mathcal{H}_i$; $\dim \mathcal{H}_i^t = d_i^2$). These maps represent the most general transformations between two
density operators in quantum mechanics. The complete positivity of the map is a
physically motivated and highly restrictive condition; however, there is a well known
representation of such maps, originally due to Choi [?] and then popularized by Kraus
[17], called the operator sum decomposition, or Kraus decomposition. This decomposition
represents the action of the map $\mathcal{E}: \mathcal{H}_1^t \to \mathcal{H}_2^t$ as

$$\rho \to \mathcal{E}(\rho) = \sum_k \Upsilon_k \rho \Upsilon_k^\dagger \qquad (1)$$

where $\rho \in \mathcal{H}_1^t$ and $\mathcal{E}(\rho) \in \mathcal{H}_2^t$. The operators $\Upsilon_k: \mathcal{H}_1 \to \mathcal{H}_2$ are called the Kraus operators
of the decomposition. The sum in this decomposition can run over an infinite set, but in
general $\mathcal{E}$ can be represented using at most $d_1 d_2$ Kraus operators [18]. In this paper we
will restrict our attention to trace-preserving quantum operations between isomorphic
Hilbert spaces (i.e. $d_1 = d_2 = d$). The trace preservation implies that the Kraus
operators satisfy a normalization condition:

$$\sum_k \Upsilon_k^\dagger \Upsilon_k = I_d \qquad (2)$$

where $I_d$ is the $d$-dimensional identity operator. Of course, for a closed quantum system
undergoing unitary evolution, there is only one Kraus operator in the decomposition
and it is unitary - i.e. $\mathcal{E}(\rho) = V \rho V^\dagger$ with $V^\dagger V = I_d$.
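As a concrete illustration of the operator sum decomposition and its normalization condition, the following minimal sketch applies a two-operator qubit channel to a state and checks trace preservation. The amplitude damping channel, the parameter name `gamma` and all function names are assumptions chosen for illustration; they do not appear in the text above.

```python
import math

def dagger(m):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[complex(m[r][c]).conjugate() for r in range(len(m))]
            for c in range(len(m[0]))]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matadd(a, b):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def amplitude_damping_kraus(gamma):
    """Two Kraus operators of a qubit amplitude damping channel (illustrative choice)."""
    return [
        [[1.0, 0.0], [0.0, math.sqrt(1.0 - gamma)]],
        [[0.0, math.sqrt(gamma)], [0.0, 0.0]],
    ]

def apply_channel(kraus, rho):
    """Operator sum: sum_k K_k rho K_k^dagger, for 2x2 matrices."""
    out = [[0j, 0j], [0j, 0j]]
    for k in kraus:
        out = matadd(out, matmul(matmul(k, rho), dagger(k)))
    return out

def normalization(kraus):
    """Left-hand side of the normalization condition: sum_k K_k^dagger K_k."""
    total = [[0j, 0j], [0j, 0j]]
    for k in kraus:
        total = matadd(total, matmul(dagger(k), k))
    return total

kraus = amplitude_damping_kraus(0.3)
rho_plus = [[0.5, 0.5], [0.5, 0.5]]      # the pure state |+><+|
rho_out = apply_channel(kraus, rho_plus)
identity_check = normalization(kraus)    # should be the 2x2 identity
```

Here `identity_check` reproduces the identity to rounding error, and `rho_out` keeps unit trace while population moves from the second basis state to the first.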

An important property of this decomposition is the non-uniqueness of the Kraus
operators. A new set of Kraus operators for the same quantum operation can be derived
by an arbitrary unitary remixing of the original set. That is, if $\{\Upsilon_k\}_{k=1}^N$ are the Kraus
operators in a decomposition of $\mathcal{E}$, then the set $\{\Omega_j\}_{j=1}^N$:

$$\Omega_j = \sum_{k=1}^{N} u_{jk} \Upsilon_k \qquad (3)$$

where $u_{jk}$ are the elements of a unitary matrix, are the Kraus operators for a
different, but equivalent, operator sum decomposition of $\mathcal{E}$. This non-uniqueness will
become important when we describe the estimation of quantum operations in terms
of their Kraus operators; in particular, a specific decomposition, called the canonical
decomposition, will be important. The canonical decomposition is a distinguished Kraus
decomposition that can be constructed with respect to any given input state with the
following property: $\mathcal{E}(\rho) = \sum_k \Upsilon_k \rho \Upsilon_k^\dagger$ with $\mathrm{tr}(\Upsilon_k^\dagger \Upsilon_j \rho) = \delta_{jk} p_k$. Trace preservation
implies $\sum_k p_k = 1$. The explicit construction of these Kraus operators is detailed in
Ref. [18]; however, for our purposes the important thing to notice is their dependence
on the input state - if the input state changes, the Kraus operators in the canonical
decomposition for a given quantum process will change.
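The unitary freedom described above can be checked numerically. The sketch below remixes the Kraus operators of an illustrative qubit channel with a fixed 2x2 unitary matrix and confirms that both sets produce the same output state. The channel, the mixing matrix `u` and the helper names are assumptions made for this example only.

```python
import math

def dagger(m):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[complex(m[r][c]).conjugate() for r in range(len(m))]
            for c in range(len(m[0]))]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply_channel(kraus, rho):
    """Operator sum: sum_k K_k rho K_k^dagger, for 2x2 matrices."""
    out = [[0j, 0j], [0j, 0j]]
    for k in kraus:
        term = matmul(matmul(k, rho), dagger(k))
        out = [[x + y for x, y in zip(ro, rt)] for ro, rt in zip(out, term)]
    return out

def remix(kraus, u):
    """New Kraus set Omega_j = sum_k u[j][k] * K_k for a unitary matrix u."""
    n, d = len(kraus), len(kraus[0])
    return [[[sum(u[j][k] * kraus[k][r][c] for k in range(n))
              for c in range(d)] for r in range(d)] for j in range(n)]

gamma = 0.3                                   # illustrative damping strength
kraus = [
    [[1.0, 0.0], [0.0, math.sqrt(1.0 - gamma)]],
    [[0.0, math.sqrt(gamma)], [0.0, 0.0]],
]
s = 1.0 / math.sqrt(2.0)
u = [[s, 1j * s], [1j * s, s]]                # a fixed 2x2 unitary mixing matrix
remixed = remix(kraus, u)

rho = [[0.5, 0.5], [0.5, 0.5]]                # input state |+><+|
out_a = apply_channel(kraus, rho)
out_b = apply_channel(remixed, rho)
max_diff = max(abs(out_a[i][j] - out_b[i][j]) for i in range(2) for j in range(2))
```

The individual operators in `remixed` differ from those in `kraus`, yet `max_diff` vanishes to rounding error, which is the sense in which the two decompositions are equivalent.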

Now we describe an important interpretation of quantum operations. By the
postulates of quantum mechanics, a closed quantum system will undergo unitary
evolution. Any departure from such an evolution is caused by coupling of the system to
additional degrees of freedom (usually termed the environment). Therefore, any non-unitary
quantum operation is the effective description of the evolution of a quantum
system that is coupled to an environment. This leads us to think of a quantum operation,
acting on a system of interest, as a unitary map of the system plus some environment
(which combined form a closed system) after which the environment is traced out [18].
That is,

$$\mathcal{E}(\rho) = \mathrm{tr}_{env}[U(\rho \otimes \rho_{env})U^\dagger] \qquad (4)$$

where $\rho$ is a density operator for the system, $\rho_{env}$ is a density operator for the
environment, and $U$ is a unitary operator acting on both the system and the environment.
This is sometimes referred to as the dilation of the quantum operation. Now, assuming
that $\{|e_k\rangle\}$ is a complete orthonormal basis for the state space of the environment and
the initial environment state is $\rho_{env} = |e_0\rangle\langle e_0|$, then the Kraus decomposition of $\mathcal{E}$ can
be written explicitly by performing the trace in this basis:

$$\begin{aligned}
\mathcal{E}(\rho) &= \mathrm{tr}_{env}[U(\rho \otimes \rho_{env})U^\dagger] \\
&= \sum_k \langle e_k| U [\rho \otimes |e_0\rangle\langle e_0|] U^\dagger |e_k\rangle \\
&= \sum_k \Upsilon_k \rho \Upsilon_k^\dagger \qquad (5)
\end{aligned}$$

where $\Upsilon_k = \langle e_k|U|e_0\rangle$ is an operator acting only on the system subspace. The unitary
freedom in the choice of Kraus operators is exactly the same as the unitary freedom in
choosing the environmental basis states in which to perform the trace. The assumption
of a pure initial state for the environment is not very restrictive because we can always
choose the environment to be large enough so that this condition is met. The other
assumption, that the initial state is separable, is a much more subtle one and a full
discussion of it is out of the scope of this paper. It has been extensively discussed in
the literature, and we refer the interested reader to the recent treatment in Ref. [?].

2.2. Optimal estimation 

We are concerned with optimally estimating a one-parameter quantum operation, $\mathcal{E}_\theta$.
We assume that we have available an operator sum decomposition of the operation,
but that the Kraus operators have a free real, continuous parameter, $\theta$, which is to be
estimated.

$$\rho(\theta) \equiv \mathcal{E}_\theta(\rho_0) = \sum_k \Upsilon_k(\theta) \rho_0 \Upsilon_k^\dagger(\theta) \qquad (6)$$

with $\sum_k \Upsilon_k^\dagger(\theta)\Upsilon_k(\theta) = I_d$. Here $\rho_0 \in \mathcal{H}^t$ ($\dim \mathcal{H}^t = d^2$) is the input state to the
operation over which we have complete control. A schematic of the estimation process
is given in Figure 1. In each run of the experiment, the input state $\rho_0$ is fed into the
quantum operation and a generalized measurement is performed on the output. The
generalized measurement is described by non-negative, Hermitian operators $E(\xi)$, or
POVMs, which satisfy the completeness property: $\int d\xi\, E(\xi) = I_d$. $\xi$ labels the result/s
of the measurement and can be univariate or multivariate, as well as continuous or
discrete (in the discrete case, the completeness integral becomes a sum). Given such
a description of the measurement, the probability density for the measurement result,
conditioned on a given state, is given by $p(\xi|\theta) = \mathrm{tr}(E(\xi)\rho(\theta))$. We will assume that there
are $N$ independent runs of this experiment after which the parameter $\theta$ is estimated
using the results of $N$ independent, separable measurements on $N$ outputs from the
channel. That is, $\theta_{est} = \theta_{est}(\xi_1, \xi_2, \ldots, \xi_N)$. The estimator, $T$, is any function of the
$N$ measurement outcomes, and it attempts to reconstruct $\theta$ from these outcomes. We
will assume that the estimator, $T$, is unbiased. That is, $E_\theta\{\theta_{est}\} = \theta$ where $E_\theta\{\cdot\}$ is
the expectation value with respect to the probability distribution of the outcomes for a given $\theta$. This is a mild
assumption that will not affect the essential results. An example of an estimator (which
is also unbiased) is the sample mean: $\hat{m} = \frac{1}{N}\sum_{i=1}^{N} x_i$, which is an estimate of the
true mean of the probability distribution that the $x_i$ are drawn from.

This is a very general setting in which to describe parameter estimation. We can 
account for scenarios such as entanglement assisted estimation by simply changing the 
definition of the quantum operation to be a tensor product of two or more operations. 
We will see an example of this in section 4. We are seeking the optimal scheme for
estimating the free parameter, $\theta$. Loosely, this amounts to specifying an input state, a
measurement scheme, and an estimator that will permit one to gain the most information
about $\theta$, and we will now proceed by making this notion of optimality more precise.

Firstly, the deviation of our estimate from the actual value of the parameter can
be measured by:

$$\delta\theta = \theta_{est} - \theta \qquad (7)$$

It is natural to consider the estimation scheme that minimizes the variance of this
estimation error as the optimal one. That is, we want to minimize $\langle(\delta\theta)^2\rangle$. Braunstein
and Caves consider a similar problem in Ref. [?], and we will follow their treatment in
order to find the optimal scheme.

As mentioned in the introduction, this optimization problem can be split into two
subproblems: (i) the choice of an optimal input state $\rho_0$, and (ii) the choice of an
optimal scheme (choice of $\{E(\xi)\}$ and $T$) to estimate $\theta$ from the one parameter family
of output states $\rho(\theta) = \mathcal{E}_\theta(\rho_0)$. In this paper, we will not consider the first subproblem
- we will assume that the input state is fixed and focus on the optimization of the
estimation scheme. The second subproblem, the optimization of the estimation scheme,
can be further broken down into two steps: first, a minimization of the error variance
over estimators $T$ for a given quantum measurement, and second, a minimization over
all quantum measurements. The optimization over estimators is an entirely classical
one - it is well known in the statistical inference literature and results in the famous
Cramer-Rao bound [21]:

$$\langle(\delta\theta)^2\rangle \geq \frac{1}{N F(\theta)} \qquad (8)$$

where $N$ is the number of measurement results used in the estimation, and $F(\theta)$ is the
Fisher information:

$$F(\theta) \equiv \int d\xi\, p(\xi|\theta) \left( \frac{\partial \ln p(\xi|\theta)}{\partial\theta} \right)^2 = \int d\xi\, \frac{1}{p(\xi|\theta)} \left( \frac{\partial p(\xi|\theta)}{\partial\theta} \right)^2 \qquad (9)$$

The Fisher information represents the amount of information about $\theta$ contained in
the measurement result $\xi$. The dependence of this quantity on the choice of quantum
measurement is clear from the fact that $p(\xi|\theta) = \mathrm{tr}(E(\xi)\rho(\theta))$. Strictly, this form of the
bound is only valid for unbiased estimators. As we will only consider such estimators
we will not state the more general form of the bound here.
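To make the Fisher information and the Cramer-Rao bound concrete, the following sketch evaluates the discrete-outcome form of the Fisher information, with the integral replaced by a sum, using a central finite difference for the derivative. The two-outcome distribution with probabilities $(1-\theta, \theta)$, the step size `h` and the function names are illustrative assumptions rather than anything specified in the text.

```python
def fisher_information(prob_fn, theta, h=1e-5):
    """Discrete Fisher information: sum over outcomes of (dp/dtheta)^2 / p.

    prob_fn(theta) returns the list of outcome probabilities; the derivative
    is approximated by a central finite difference with step h.
    """
    p = prob_fn(theta)
    p_plus = prob_fn(theta + h)
    p_minus = prob_fn(theta - h)
    total = 0.0
    for pk, pp, pm in zip(p, p_plus, p_minus):
        if pk > 0.0:                      # outcomes with zero probability are skipped
            dp = (pp - pm) / (2.0 * h)
            total += dp * dp / pk
    return total

def cramer_rao_bound(fisher, n_runs):
    """Lower bound on the variance of an unbiased estimator after n_runs runs."""
    return 1.0 / (n_runs * fisher)

def two_outcome(theta):
    """Outcome probabilities (1 - theta, theta) of a hypothetical binary measurement."""
    return [1.0 - theta, theta]

fisher = fisher_information(two_outcome, 0.2)   # analytically 1 / (0.2 * 0.8) = 6.25
bound = cramer_rao_bound(fisher, 100)           # analytically 0.0016
```

For this distribution the sample frequency of the second outcome is an unbiased estimator whose variance $\theta(1-\theta)/N$ equals the bound, so the inequality is tight in this simple case.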

The Cramer-Rao bound effectively takes the estimator out of the picture. It says
that for a given input state and measurement (i.e. for a given $p(\xi|\theta)$) the variance is
lower bounded by the quantity on the right-hand side of Eq. (8). Thus subproblem
(ii) - the estimation on the parametric family of output states - simplifies to finding the
best measurement: the one that minimizes this lower bound, or equivalently, maximizes
the Fisher information. As an aside, under mild regularity conditions on $p(\xi|\theta)$ the lower
bound in Eq. (8) is an asymptotically achievable one; that is, there exist estimators that
can attain this bound as $N \to \infty$, and an example is the maximum likelihood estimator
[22]. The estimators that achieve this bound have been extensively studied in the field of
statistical inference; therefore we will not consider them here but will rather concentrate
on the quantum aspects of the problem.

Given the Cramer-Rao bound, the next step of the optimization becomes a
maximization of Eq. (9) over all possible quantum measurements. We denote this
maximization, and the result, by:

$$F^*(\theta) = \max_{\{E(\xi)\}} F(\theta) \qquad (10)$$

2.2.1. A geometric perspective Before treating this maximization, we examine the
process estimation problem from a geometric perspective. The one parameter family
formed by the output of the process for a fixed input state, $\{\rho(\theta)\}_\theta$, defines a curve
in density operator space which is parametrized by the continuous, real parameter $\theta$.
The curve is itself a manifold defined by $\rho_0 = |\psi_0\rangle\langle\psi_0|$ and $\mathcal{E}_\theta$, and the advantage of
regarding members of $\{\rho(\theta)\}_\theta$ as outputs of a quantum process represented by its Kraus
decomposition is that we can define a natural local co-ordinate patch at each point
on the curve ‡. That is, $\rho(\theta) = \sum_k \Upsilon_k(\theta)|\psi_0\rangle\langle\psi_0|\Upsilon_k^\dagger(\theta) = \sum_k |e_k(\theta)\rangle\langle e_k(\theta)|$, where
$|e_k(\theta)\rangle = \Upsilon_k(\theta)|\psi_0\rangle$ are unnormalized vectors. Now if we exclusively use the canonical
decomposition for the process, this can be rewritten as an eigendecomposition of $\rho(\theta)$:
$\sum_k p_k(\theta)|f_k(\theta)\rangle\langle f_k(\theta)|$, where $|f_k(\theta)\rangle = \frac{1}{\sqrt{p_k(\theta)}}|e_k(\theta)\rangle$ and $\langle f_j(\theta)|f_k(\theta)\rangle = \delta_{jk}$. The set
$\{|f_k(\theta)\rangle\}$ can be considered a local orthonormal co-ordinate basis at the point $\theta$ on the
curve. A point to note is that while we write the eigendecomposition as a sum over
$k$, the number of Kraus operators in the canonical decomposition, it does not mean
that $\rho(\theta)$ has the same number of eigenvectors as there are Kraus operators
in the Kraus decomposition. $\rho(\theta)$ can have at most $d$ eigenvectors, where $d$ is the
dimension of the system, while there is no limit on the number of Kraus operators. In
the canonical decomposition, $\Upsilon_k(\theta)|\psi_0\rangle = 0$ for some values of $k$, and these terms will
drop out in the eigendecomposition sum.

‡ Here we have assumed that the input state is pure. We will see below that the optimal input state
will always be a pure one, and therefore this assumption is justified.

The Fisher information can be used to define a Riemannian metric on this curve
(submanifold), that measures the statistical distinguishability of neighbouring one-parameter
quantum operations given the fixed input state $|\psi_0\rangle$. To see this, we go back
to the definitions above and note that the Cramer-Rao bound, Eq. (8), is a lower bound
on the variance in the error when estimating the parameter $\theta$. Thus it is a lower bound
on the error in reliably distinguishing between two neighbouring quantum operations:
$\mathcal{E}_\theta$ and $\mathcal{E}_{\theta+d\theta}$. Therefore, as in [?], we can consider it a distinguishability metric along the
curve of one parameter quantum operations defined by $|\psi_0\rangle$ and $\mathcal{E}_\theta$. More formally, let us
establish $\min[\sqrt{N}\langle(\delta\theta)^2\rangle^{1/2}]$ to be a measure of statistical deviation. The $\sqrt{N}$ removes
the improvement in estimation due to increased sampling, and the minimization is over
measurement schemes to ensure that we are considering the most discriminating scheme.
A statistical measure of distinguishability should be proportional to the inverse of this
deviation measure - i.e. the more the deviation, the less distinguishable neighbouring
operations become. Thus we can define a distinguishability metric along the curve as:

$$\left(\frac{ds}{d\theta}\right)^2 \equiv \frac{1}{\min[N\langle(\delta\theta)^2\rangle]} \qquad (11)$$

$(ds/d\theta)^2$ is well known as the statistical distance [?], and using Eq. (8), we can rewrite
it in terms of the Fisher information as

$$\left(\frac{ds}{d\theta}\right)^2 = \max_{\{E(\xi)\}} F(\theta) = F^*(\theta) \qquad (12)$$

This is exactly the maximization we are considering for optimal estimation. Note that
it is only over the measurement POVMs ($\{E(\xi)\}$) because this is the statistical distance
over a curve defined by a particular input state. A caveat is required here: when we
refer to $F^*(\theta)$ as being the metric on the curve, this is strictly only true if it can be
shown that the bound set by $F^*(\theta)$ can be achieved. This question of achievability will
be important in the following.

3. The optimization 

As outlined in section 2.2, the procedure of finding the best estimation scheme can
be phrased as a sequence of optimizations. The first optimization, which is entirely 
classical, results in the Cramer-Rao bound, and in this section we shall examine the 
quantum aspects of the problem. 

3.1. Optimal estimation on the output family 

The optimal quantum measurement scheme is the set of POVMs that maximize the
Fisher information, for a fixed input state. Using the definition of the Fisher information,
Eq. (9), and the fact that $p(\xi|\theta) = \mathrm{tr}(E(\xi)\rho(\theta))$ we get:

$$F^*(\theta) = \max_{\{E(\xi)\}} \int d\xi\, \frac{[\mathrm{tr}(E(\xi)\rho'(\theta))]^2}{\mathrm{tr}(E(\xi)\rho(\theta))} \qquad (13)$$

where $\rho(\theta) = \mathcal{E}_\theta(\rho_0) = \sum_k \Upsilon_k(\theta)\rho_0\Upsilon_k^\dagger(\theta)$ and $\rho'(\theta) = d\rho(\theta)/d\theta$. Let $\{\Upsilon_k\}$ be the Kraus
operators for an arbitrary Kraus decomposition of $\mathcal{E}$. Now, the next logical step would
be to replace $\rho(\theta)$ and $\rho'(\theta)$ by their definitions in terms of the Kraus operators that
define the quantum operation. However, this makes the maximization of (13) difficult
due to the introduction of the Kraus decomposition sum in the numerator. Instead, we
will take a step back and use the dilation of the quantum operation.

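As mentioned in section 2.1, a quantum operation can be thought of as a unitary
map of the system plus some environment after which the environment is traced out.
Given this, we will label our system A and the environment B, and define:

$$\begin{aligned}
\rho_A(\theta) = \mathcal{E}_\theta(\rho_A^0) &= \sum_k \Upsilon_k(\theta)\rho_A^0\Upsilon_k^\dagger(\theta) \\
&= \mathrm{tr}_B\{ U(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^\dagger(\theta) \} \qquad (14)
\end{aligned}$$

where $U(\theta)$ is some unitary operator acting on systems A and B, and $\rho_A^0$ is the input
state on subsystem A. The mapping between $U(\theta)$ and $\{\Upsilon_k(\theta)\}$ is not unique because
of the freedom in choosing the environment basis states, and we will return to this point
shortly. Also,

$$\begin{aligned}
\rho_A'(\theta) &= \mathrm{tr}_B\{ U'(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^\dagger(\theta) + U(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^{\dagger\prime}(\theta) \} \\
&= \mathrm{tr}_B\{ \Pi(\theta) + \Pi^\dagger(\theta) \} \qquad (15)
\end{aligned}$$

where $U'(\theta) = dU(\theta)/d\theta$, and $\Pi(\theta) = U'(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^\dagger(\theta)$.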

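Now we return to the problem of maximizing (13). Substituting (14) and (15), we
get (in the following, we will use $E_A(\xi) \otimes I_B$ and $E_A(\xi) I_B$ interchangeably to denote the
same operator):

$$\begin{aligned}
F(\theta) &= \int d\xi\, \frac{\left( \mathrm{tr}_A\{ E_A(\xi)\, \mathrm{tr}_B\{ \Pi(\theta) + \Pi^\dagger(\theta) \} \} \right)^2}{\mathrm{tr}_A\{ E_A(\xi)\, \mathrm{tr}_B\{ U(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^\dagger(\theta) \} \}} \\
&= \int d\xi\, \frac{\left( \mathrm{tr}_A\{ \mathrm{tr}_B\{ (E_A(\xi) I_B)(\Pi(\theta) + \Pi^\dagger(\theta)) \} \} \right)^2}{\mathrm{tr}_A\{ \mathrm{tr}_B\{ (E_A(\xi) I_B) U(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^\dagger(\theta) \} \}} \\
&= \int d\xi\, \frac{\left( \mathrm{tr}\{ (E_A(\xi) I_B)\Pi(\theta) + (E_A(\xi) I_B)\Pi^\dagger(\theta) \} \right)^2}{\mathrm{tr}\{ (E_A(\xi) I_B) U(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^\dagger(\theta) \}} \\
&= \int d\xi\, \frac{\left( 2\, \mathrm{Re}\, \mathrm{tr}\{ (E_A(\xi) I_B)\Pi(\theta) \} \right)^2}{\mathrm{tr}\{ (E_A(\xi) I_B) U(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^\dagger(\theta) \}} \\
&\leq 4 \int d\xi\, \frac{\left| \mathrm{tr}\{ (E_A(\xi) I_B)\Pi(\theta) \} \right|^2}{\mathrm{tr}\{ (E_A(\xi) I_B) U(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^\dagger(\theta) \}} \\
&= 4 \int d\xi\, \frac{\left| \mathrm{tr}\{ (E_A(\xi) \otimes I_B) U'(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^\dagger(\theta) \} \right|^2}{\mathrm{tr}\{ (E_A(\xi) \otimes I_B) U(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^\dagger(\theta) \}}
\end{aligned}$$
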
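We proceed by applying the Cauchy-Schwarz inequality: $|\mathrm{tr}(O^\dagger P)|^2 \leq \mathrm{tr}(O^\dagger O)\, \mathrm{tr}(P^\dagger P)$,
with equality when $O = \lambda P$ for some constant $\lambda$. We will apply this inequality to
the numerator, where $O^\dagger = (E_A(\xi) I_B)^{1/2} U'(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|]^{1/2}$ and $P = [\rho_A^0 \otimes
|e_0\rangle_B\langle e_0|]^{1/2} U^\dagger(\theta) (E_A(\xi) I_B)^{1/2}$, to get:

$$\begin{aligned}
F(\theta) &\leq 4 \int d\xi\, \frac{\mathrm{tr}\{ (E_A(\xi) I_B) U(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^\dagger(\theta) \}}{\mathrm{tr}\{ (E_A(\xi) I_B) U(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^\dagger(\theta) \}} \\
&\qquad\qquad \times\, \mathrm{tr}\{ (E_A(\xi) I_B) U'(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^{\dagger\prime}(\theta) \} \\
&= 4 \int d\xi\, \mathrm{tr}\{ (E_A(\xi) \otimes I_B) U'(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^{\dagger\prime}(\theta) \} \\
&= 4\, \mathrm{tr}\{ U'(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^{\dagger\prime}(\theta) \} \\
&= 4\, \mathrm{tr}_A \mathrm{tr}_B\{ U'(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|] U^{\dagger\prime}(\theta) \} \\
&= 4\, \mathrm{tr}_A\Big\{ \sum_k \Upsilon_k'(\theta) \rho_A^0 \Upsilon_k^{\dagger\prime}(\theta) \Big\} \\
&= 4\, \mathrm{tr}\Big\{ \sum_k \Upsilon_k^{\dagger\prime}(\theta) \Upsilon_k'(\theta) \rho_0 \Big\} \qquad (16)
\end{aligned}$$

where we have dropped all subscripts in the last line because all operators are in
subsystem A, and the completeness relation $\int d\xi\, E(\xi) = I_d$ has been used.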
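We have arrived at a bound on the Fisher information:

$$C_\Upsilon(\theta) \equiv 4\, \mathrm{tr}\Big\{ \sum_k \Upsilon_k^{\dagger\prime}(\theta) \Upsilon_k'(\theta) \rho_0 \Big\} \qquad (17)$$

This bound is equal to the maximum, $F^*(\theta)$, if it is achievable. To achieve the bound
we need to saturate the two inequalities used in the derivation. The condition for
saturating the first inequality is:

$$\begin{aligned}
& \mathrm{Im}\, \mathrm{tr}\{ (E_A(\xi) I_B) \Pi(\theta) \} = 0 \\
\iff\; & \mathrm{Im}\, \mathrm{tr}_A\Big\{ E(\xi) \sum_k \Upsilon_k'(\theta) \rho_0 \Upsilon_k^\dagger(\theta) \Big\} = 0 \\
\iff\; & \mathrm{Im}\, \mathrm{tr}\Big\{ \sum_k \Upsilon_k^\dagger(\theta) E(\xi) \Upsilon_k'(\theta) \rho_0 \Big\} = 0 \quad \forall \xi \qquad (18)
\end{aligned}$$
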

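The condition for meeting the Cauchy-Schwarz bound is

$$(E_A(\xi) \otimes I_B)^{1/2} U'(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|]^{1/2} = \lambda_\xi(\theta)\, (E_A(\xi) \otimes I_B)^{1/2} U(\theta) [\rho_A^0 \otimes |e_0\rangle_B\langle e_0|]^{1/2} \quad \forall \xi$$

where the constant $\lambda_\xi(\theta)$ can generally depend on $\xi$ and $\theta$.

We would like a condition in terms of the Kraus operators instead of the unitary $U$.
To do this, multiply from the left by an identity in the form $I_A \otimes \sum_k |e_k\rangle_B\langle e_k|$ to get

$$\begin{aligned}
& \sum_k [E(\xi)^{1/2} \Upsilon_k'(\theta) \rho_0^{1/2}]_A \otimes |e_k\rangle_B\langle e_0| = \lambda_\xi(\theta) \sum_k [E(\xi)^{1/2} \Upsilon_k(\theta) \rho_0^{1/2}]_A \otimes |e_k\rangle_B\langle e_0| \quad \forall \xi \\
\iff\; & E(\xi)^{1/2} \Upsilon_k'(\theta) \rho_0^{1/2} = \lambda_\xi(\theta) E(\xi)^{1/2} \Upsilon_k(\theta) \rho_0^{1/2} \quad \forall \xi, k \qquad (19)
\end{aligned}$$

where the last step uses the orthogonality of $|e_k\rangle_B$. Eq. (19) defines a series of conditions
that the optimal measurement must satisfy. In general, $k$ runs from 1 to $d^2$ where $d$ is
the dimension of the state-space of $\rho$, and therefore we see simply from the number of
constraining equations that the optimal measurement is severely restricted.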

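We can reduce these two conditions to one by substituting (19) into the statement
for the first condition (18):

$$\begin{aligned}
& \mathrm{Im}\, \mathrm{tr}\Big\{ \sum_k \Upsilon_k^\dagger(\theta) E(\xi) \Upsilon_k'(\theta) \rho_0 \Big\} = 0 \quad \forall \xi \\
\Rightarrow\; & \mathrm{Im}\, \mathrm{tr}\Big\{ \sum_k \Upsilon_k^\dagger(\theta) E(\xi) \lambda_\xi(\theta) \Upsilon_k(\theta) \rho_0 \Big\} = 0 \quad \forall \xi \\
\Rightarrow\; & \mathrm{Im}\, \lambda_\xi(\theta)\, \mathrm{tr}\{ \rho(\theta) E(\xi) \} = 0 \quad \forall \xi \qquad (20)
\end{aligned}$$

Now, because the trace on the last line is always real, this condition is met if and only
if $\lambda_\xi(\theta)$ is real. Therefore to summarize, the optimal measurement scheme must satisfy
the conditions:

$$E(\xi)^{1/2} \Upsilon_k'(\theta) \rho_0^{1/2} = \lambda_\xi(\theta) E(\xi)^{1/2} \Upsilon_k(\theta) \rho_0^{1/2} \quad \forall \xi, k \qquad (21)$$

where $\lambda_\xi(\theta)$ is a real number that can depend on $\xi$.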

Condition (21) is a condition on the optimal measurement POVM and the optimal
input state. However, although it defines the optimal strategy, it is not a constructive
condition. Except for special cases (that we will outline below) it is difficult to define
the optimal measurement in terms of $\{\Upsilon_k\}$ and $\rho_0$ from the above condition. We will
say more about the satisfiability of these conditions, and thus the achievability of the
Fisher information bound, below.

We conclude this subsection by noting that an immediate consequence of the form of
$C_\Upsilon(\theta)$ is that a pure input state, $\rho_0 = |\psi_0\rangle\langle\psi_0|$, is optimal; this follows from the
linearity of trace, the convexity of the set of density operators, and the positivity of the operator
$\sum_k \Upsilon_k^{\dagger\prime}(\theta)\Upsilon_k'(\theta)$.
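The bound on the Fisher information derived in this subsection, four times the trace of the summed products of differentiated Kraus operators with the input state, can be evaluated numerically for a given parametrized Kraus family. The sketch below does so with central finite differences. The two example channels (a phase rotation about the z axis and a bit flip with probability $\theta$), the step size `h` and all function names are assumptions introduced for illustration.

```python
import cmath
import math

def dagger(m):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[complex(m[r][c]).conjugate() for r in range(len(m))]
            for c in range(len(m[0]))]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def kraus_bound(kraus_fn, rho0, theta, h=1e-6):
    """4 * tr[ sum_k (dK_k/dtheta)^dagger (dK_k/dtheta) rho0 ], derivatives by central differences."""
    plus, minus = kraus_fn(theta + h), kraus_fn(theta - h)
    total = 0.0
    for k_plus, k_minus in zip(plus, minus):
        dk = [[(a - b) / (2.0 * h) for a, b in zip(row_p, row_m)]
              for row_p, row_m in zip(k_plus, k_minus)]
        total += complex(trace(matmul(matmul(dagger(dk), dk), rho0))).real
    return 4.0 * total

def phase_rotation(theta):
    """Single unitary Kraus operator exp(-i theta sigma_z / 2)."""
    return [[[cmath.exp(-0.5j * theta), 0.0], [0.0, cmath.exp(0.5j * theta)]]]

def bit_flip(theta):
    """Kraus operators sqrt(1 - theta) * I and sqrt(theta) * sigma_x."""
    a, b = math.sqrt(1.0 - theta), math.sqrt(theta)
    return [[[a, 0.0], [0.0, a]], [[0.0, b], [b, 0.0]]]

rho_plus = [[0.5, 0.5], [0.5, 0.5]]   # |+><+|
rho_zero = [[1.0, 0.0], [0.0, 0.0]]   # |0><0|
c_phase = kraus_bound(phase_rotation, rho_plus, 0.7)   # analytically 1
c_flip = kraus_bound(bit_flip, rho_zero, 0.2)          # analytically 1 / (0.2 * 0.8) = 6.25
```

For the bit flip, this particular decomposition gives the same value for every input state, even though the input $|+\rangle$ is left unchanged by the channel and so carries no information about $\theta$. This is one way to see that the value depends on the chosen Kraus decomposition, the issue taken up in section 3.2.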

3.2. Uniqueness and achievability 

We derived a bound on the Fisher information above, and here we will discuss the
uniqueness and achievability of this bound. These are both important questions because
the achievability makes the bound meaningful (and means that $C_\Upsilon(\theta)$ can be viewed as
a metric on the curve formed by the parametrized output family) and the uniqueness
makes it useful as a characterization of optimality.

The Fisher information bound we have in Eq. (17) is non-unique because the
Kraus operators of the quantum operation are not unique. For each choice of Kraus
decomposition, the value of $C_\Upsilon(\theta)$ provides a possibly different upper bound to the Fisher
information. So a natural question is: how does changing the Kraus decomposition
modify the bound?
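Let $\{\Upsilon_k(\theta)\}$ and $\{\Omega_k(\theta)\}$ both be Kraus operator sets for $\mathcal{E}_\theta$. Then from section
2.1 we know that these two sets are related by a unitary transformation:

$$\Omega_k(\theta) = \sum_j u_{kj}(\theta) \Upsilon_j(\theta) \qquad (22)$$

where $u_{kj}(\theta)$ are the elements of a unitary matrix, which crucially, can also depend on
$\theta$. Now, the Fisher information bound, $C_\Omega(\theta)$, for the Kraus operator choice $\{\Omega_k(\theta)\}$ in
terms of the Kraus operators $\{\Upsilon_k(\theta)\}$ is:

$$\begin{aligned}
C_\Omega(\theta) &= 4 \sum_k \mathrm{tr}[ \Omega_k^{\dagger\prime} \Omega_k' \rho_0 ] \\
&= 4 \sum_k \mathrm{tr}\Big[ \Big( \sum_j u_{kj}^{*\prime} \Upsilon_j^\dagger + \sum_j u_{kj}^{*} \Upsilon_j^{\dagger\prime} \Big) \Big( \sum_l u_{kl}' \Upsilon_l + \sum_l u_{kl} \Upsilon_l' \Big) \rho_0 \Big] \\
&= 4 \Big( \sum_{jkl} u_{kj}^{*} u_{kl}\, \mathrm{tr}[ \Upsilon_j^{\dagger\prime} \Upsilon_l' \rho_0 ] + \sum_{jkl} u_{kj}^{*\prime} u_{kl}'\, \mathrm{tr}[ \Upsilon_j^\dagger \Upsilon_l \rho_0 ] \\
&\qquad + \sum_{jkl} u_{kj}^{*\prime} u_{kl}\, \mathrm{tr}[ \Upsilon_j^\dagger \Upsilon_l' \rho_0 ] + \sum_{jkl} u_{kj}^{*} u_{kl}'\, \mathrm{tr}[ \Upsilon_j^{\dagger\prime} \Upsilon_l \rho_0 ] \Big) \qquad (23)
\end{aligned}$$

From this expression it is clear that this bound, $C_\Omega$, can be made as large as desired
by appropriately choosing the unitary matrix $[u_{kj}(\theta)]$. In particular, it can be made to
diverge by choosing a $[u_{kj}(\theta)]$ that is discontinuous in $\theta$. Therefore, the sensible thing
to consider is the minimum value of this bound - that is, the minimum of Eq. (17) with
respect to a choice of Kraus operators.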
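As a hedged illustration of this freedom (a worked example that is not part of the original derivation), consider the simplest case of a single unitary Kraus operator $\Upsilon(\theta) = e^{-i\theta\sigma_z/2}$ on a qubit, remixed by the $1 \times 1$ unitary matrix $u(\theta) = e^{i\phi(\theta)}$. The phase function $\phi(\theta)$ is a hypothetical choice introduced only for this example. Using $\Upsilon' = -\tfrac{i}{2}\sigma_z \Upsilon$:

```latex
\begin{align*}
\Omega(\theta) &= e^{i\phi(\theta)}\,\Upsilon(\theta), \qquad
\Omega'(\theta) = e^{i\phi(\theta)}\left[\, i\phi'(\theta)\,\Upsilon(\theta) + \Upsilon'(\theta) \,\right], \\
\Omega^{\dagger\prime}\Omega' &= \phi'^2\, I + \Upsilon^{\dagger\prime}\Upsilon'
  - i\phi'\,\Upsilon^{\dagger}\Upsilon' + i\phi'\,\Upsilon^{\dagger\prime}\Upsilon
  = \left(\phi'^2 + \tfrac{1}{4}\right) I - \phi'\,\sigma_z, \\
C_\Omega(\theta) &= 4\,\mathrm{tr}\!\left[\Omega^{\dagger\prime}\Omega'\rho_0\right]
  = 1 + 4\,\phi'(\theta)^2 - 4\,\phi'(\theta)\,\mathrm{tr}(\sigma_z\rho_0).
\end{align*}
```

With $\phi' = 0$ the value is 1. A rapidly varying phase makes $C_\Omega$ arbitrarily large, even though the channel is unchanged. Minimizing over $\phi'$ gives $1 - [\mathrm{tr}(\sigma_z\rho_0)]^2$, which for a pure input equals four times the variance of the generator $\sigma_z/2$. This is consistent with taking the minimum over decompositions as the meaningful quantity.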

The second issue to be addressed is the achievability of the bound. Let the Kraus
decomposition be fixed; then to show the attainability of the bound, we must show
that there always exists some POVM $\{E(\xi)\}$ that can meet the optimality conditions of
Eq. (21). Establishing this attainability is a subtle task, as was pointed out by Barndorff-Nielsen and
Gill in Ref. [?]. They show that in general, the optimal choice of POVM is dependent
on the actual value of the unknown parameter $\theta$, which means that in practice, an
adaptive strategy that narrows in on the value of $\theta$ has to be used. Only in some
special cases [?] does one strategy achieve the bound uniformly over $\theta$. We will refer
to an estimation strategy that is optimal at some value of $\theta$, but possibly not at other
values of $\theta$, as a locally optimal strategy.

Therefore to give the bound $C_\Upsilon(\theta)$ a unique meaning, we would ideally like to
identify the Kraus decomposition that minimizes the bound, and show that the optimal
measurement conditions containing operators from this decomposition can be met by
some POVM. That is, we want to minimize $C_\Upsilon(\theta)$ over the Kraus operators for a channel,
while stipulating that the optimality conditions Eq. (21) are met (locally, or globally).
At this stage we do not have a method of performing this optimization for the general
case, but in the next section we carry it out for a special family of quantum channels.

3.3. A special case: the quasi-classical process

In this section, we identify a special case where the issues of uniqueness (or equivalently,
minimality) and attainability of the bound in Eq. (17) can be settled. For this case, we
are able to show that the appropriate Kraus operators to use in calculating $C_\Upsilon(\theta)$ are
ones forming the canonical Kraus decomposition induced by the fixed input state.

Consider a one-parameter quantum process $\mathcal{E}_\theta$ and input state $\rho_0 = |\psi_0\rangle\langle\psi_0|$
with canonical decomposition $\mathcal{E}_\theta(\rho_0) = \sum_k \Upsilon_k(\theta)\rho_0\Upsilon_k^\dagger(\theta)$. As mentioned in section 2.1, this
decomposition is characterized by the orthogonality of the Kraus operators according to
an inner product based on the particular state $\rho_0$: $\mathcal{E}_\theta(\rho_0) = \sum_k \Upsilon_k(\theta)\rho_0\Upsilon_k^\dagger(\theta)$ with
$\mathrm{tr}(\Upsilon_k^\dagger(\theta)\Upsilon_j(\theta)\rho_0) = \delta_{jk} p_k$. Now, assume an additional orthogonality constraint on
the canonical decomposition: $\mathrm{tr}(\Upsilon_j^\dagger(\theta)\Upsilon_k'(\theta)\rho_0) = \mu_k(\theta)\delta_{jk}$, $\mu_k \in \mathbb{R}$. In terms of the
local basis on the curve this constraint can be written as $\langle e_j(\theta)|\partial_\theta e_k(\theta)\rangle = \mu_k(\theta)\delta_{jk}$,
which implies $\langle f_j(\theta)|\partial_\theta f_k(\theta)\rangle = \tilde{\mu}_k(\theta)\delta_{jk}$ where $\tilde{\mu}_k \in \mathbb{R}$ also. Note that this additional
constraint specifies a quasi-classical model where the eigenbasis of the output density
operator remains the same for all $\theta$, and it is only the eigenvalues that change along the
curve - i.e. a locally orthogonal basis can also be considered a globally orthogonal basis
along the curve. The output density operators $\{\rho(\theta)\}_\theta$ form a commuting parametric
family. In the following, we prove the attainability and uniqueness of the Fisher
information bound for this special case.

Theorem 1 Attainability: For the quasi- classical model, the optimality conditions 

EiO'^TMpI = X^i9)EiO'^Tk{9)pl Ve, k (24) 

where {T^} are members of the canonical decomposition can be met with a locally optimal 
strategy. 

Proof: We will prove this by explicitly constructing the POVM that meets these 
conditions. Consider the optimality conditions when the input state is a pure state 

iV'o): 

EjTk{9y\^Po) = \Ad)EjTk{9Mo) 
EjldeCk) = Xj{9)Ej\ek) 

E, ( r^^l/'^^^)) + V^^)\9ohm 
\^VPk{9) 

= \mVMo)E,\fkm (25) 

for all J, k (we are now assuming that the POVM has a discrete number of elements 
and are thus using a discrete index j). Now consider the choice Ej = \fj{9)){fj{9)\ for 
the POVM - the completeness condition on POVMs is automatically satisfied because 
{|/j)} are eigenstates of a Hermitian operator and therefore span the space. Given this 
choice of POVM, the optimality conditions become: 

$$ \left(\frac{p_k'}{2\sqrt{p_k}} + \sqrt{p_k}\,\mu_k - \lambda_j\sqrt{p_k}\right)|f_k\rangle\,\delta_{jk} = 0 \tag{26} $$

where we have suppressed the $\theta$ because all quantities depend on it. This condition
can be satisfied for all $j, k$ by the choice $\lambda_k = \mu_k + p_k'/(2p_k) = \nu_k/p_k$, and thus in
this special case we can construct the optimal POVM. However, note that this choice,
$E_j = |f_j(\theta)\rangle\langle f_j(\theta)|$, presumes knowledge of $\theta$ and therefore can only be implemented
adaptively [S]. As mentioned above, given the set $\{|f_j(\theta)\rangle\}$ for some $\theta$, the additional
constraint, $\langle f_j(\theta)|\partial_\theta f_k(\theta)\rangle = \mu_k(\theta)\delta_{jk}$, ensures that this set remains orthogonal for
all $\theta$. Despite this, the POVM has to be adapted during the estimation because the
normalization of the elements of the set varies with $\theta$. □
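The construction in this proof can be checked numerically on a toy example. The sketch below is not taken from the paper: it assumes a hypothetical two-outcome quasi-classical family with eigenvalues $p_1 = \cos^2\theta$, $p_2 = \sin^2\theta$ and a $\theta$-independent eigenbasis (so $\mu_k = 0$), and all function names are illustrative. It evaluates $\lambda_k = p_k'/(2p_k)$ and checks that the Fisher information of the eigenbasis measurement, $\sum_k p_k'^2/p_k$, coincides with $4\sum_k p_k\lambda_k^2$.

```python
import math

# Illustrative helpers for a hypothetical two-level quasi-classical model;
# none of these names come from the paper.

def eigenvalues(theta):
    """Assumed output spectrum p_k(theta) = (cos^2 theta, sin^2 theta)."""
    return [math.cos(theta) ** 2, math.sin(theta) ** 2]

def eigenvalue_derivatives(theta):
    """Analytic derivatives dp_k/dtheta of the spectrum above."""
    s2 = math.sin(2 * theta)
    return [-s2, s2]

def optimal_lambdas(theta):
    """lambda_k = p_k' / (2 p_k), the choice made in the proof when mu_k = 0."""
    pairs = zip(eigenvalues(theta), eigenvalue_derivatives(theta))
    return [dp / (2 * p) for p, dp in pairs]

def fisher_information(theta):
    """Classical Fisher information of the measurement E_k = |f_k><f_k|."""
    pairs = zip(eigenvalues(theta), eigenvalue_derivatives(theta))
    return sum(dp ** 2 / p for p, dp in pairs)

theta = 0.4
lams = optimal_lambdas(theta)
fisher = fisher_information(theta)
fisher_from_lambdas = 4 * sum(p * lam ** 2 for p, lam in zip(eigenvalues(theta), lams))
print(fisher, fisher_from_lambdas)
```

For this particular family both expressions are independent of $\theta$, which makes it a convenient sanity check rather than a representative channel.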

Theorem 2 Uniqueness: For the quasi-classical model, the minimum of the Fisher
information bound of Eq. (17) over the valid Kraus decompositions is achieved by the
canonical decomposition.

Proof: Firstly, by the preceding theorem, we know that for the quasi-classical model the
optimality conditions can be satisfied for the canonical Kraus operators; that is, some
POVM set $\{E(\xi)\}$ can be found such that the conditions of Eq. (21) are satisfied for
the canonical Kraus operators, and thus the bound is achievable. Now, let $C_\Upsilon$ denote
the Fisher information bound when the canonical Kraus decomposition is used. We will
show that the value of the bound using any other Kraus decomposition, $C_\Omega$, is no smaller
than $C_\Upsilon$. Consider $C_\Omega$ as given in Eq. (23). Using the identities $\sum_k u_{kj}^{*} u_{kl} = \delta_{jl}$ and
$\mathrm{tr}(\Upsilon_k^\dagger(\theta)\Upsilon_j(\theta)\rho_0) = \delta_{jk}\,p_k$, we can rewrite this expression as:

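$$ C_\Omega(\theta) = 4\Big( \sum_j \mathrm{tr}[\Upsilon_j'^{\dagger}\Upsilon_j'\rho_0] + \sum_{jk} |u_{jk}'|^2\,p_k + \sum_{jkl} u_{kj}'^{*}u_{kl}\,\mathrm{tr}[\Upsilon_j^{\dagger}\Upsilon_l'\rho_0] + \sum_{jkl} u_{kj}^{*}u_{kl}'\,\mathrm{tr}[\Upsilon_j'^{\dagger}\Upsilon_l\rho_0] \Big) \tag{27} $$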

Note that the first term is simply the Fisher information bound for the canonical Kraus
decomposition, $C_\Upsilon$, and the second term is always non-negative. Hence we see that the
question of whether $C_\Omega \ge C_\Upsilon$ depends on the sign of the third and fourth terms; i.e. we
are interested in the sign of:

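$$ G(\theta) = \sum_{jkl} u_{kj}'^{*}u_{kl}\,\mathrm{tr}[\Upsilon_j^{\dagger}\Upsilon_l'\rho_0] + \sum_{jkl} u_{kj}^{*}u_{kl}'\,\mathrm{tr}[\Upsilon_j'^{\dagger}\Upsilon_l\rho_0] \tag{28} $$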

To determine this, use the optimality conditions of Eq. (21) to get rid of the derivatives
within the trace. Explicitly, insert a resolution of identity of the form $\int d\xi\, E(\xi)$:



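$$ \begin{aligned}
G(\theta) &= \int d\xi \sum_{jkl}\Big( u_{kj}'^{*}u_{kl}\,\mathrm{tr}[\Upsilon_j^{\dagger}E(\xi)\Upsilon_l'\rho_0] + u_{kj}^{*}u_{kl}'\,\mathrm{tr}[\Upsilon_j'^{\dagger}E(\xi)\Upsilon_l\rho_0]\Big) \\
&= \int d\xi\,\lambda_\xi(\theta)\sum_{jl}\mathrm{tr}[\Upsilon_j^{\dagger}E(\xi)\Upsilon_l\rho_0]\,\sum_k\big(u_{kj}'^{*}u_{kl} + u_{kj}^{*}u_{kl}'\big) \\
&= 0
\end{aligned} \tag{29} $$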

where the second line follows from the optimality conditions, Eq. (21), and the third
line follows from taking the derivative of the orthonormality condition on the elements
of the unitary matrix $[u_{ij}(\theta)]$. Therefore,

$$ C_\Omega(\theta) = 4\Big( \sum_j \mathrm{tr}[\Upsilon_j'^{\dagger}\Upsilon_j'\rho_0] + \sum_{jk} |u_{jk}'|^2\,p_k \Big) \ge C_\Upsilon(\theta) $$

and the minimum of Eq. (17) for the quasi-classical model is achieved by the canonical
Kraus decomposition. □
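The effect of a $\theta$-dependent mixing of Kraus operators can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the paper's calculation: it takes the bound to have the form $C = 4\sum_j \|\partial_\theta(K_j|\psi_0\rangle)\|^2$ (the first term of Eq. (27)), uses a pure dephasing model with outcome probabilities $(1 \pm e^{-2\theta})/2$ as the canonical decomposition, and mixes it with a hypothetical real rotation by angle `rate * theta`. Derivatives are taken by central differences.

```python
import math

# Illustrative sketch; function names and the rotation u(theta) are assumptions.

def canonical_vectors(theta):
    """e_k = Upsilon_k|+> for pure dephasing, as coordinates in the X basis."""
    a = (1 + math.exp(-2 * theta)) / 2
    b = (1 - math.exp(-2 * theta)) / 2
    return [(math.sqrt(a), 0.0), (0.0, math.sqrt(b))]

def mixed_vectors(theta, rate):
    """Omega_j|+> = sum_k u_jk(theta) e_k, with u a rotation by rate * theta."""
    c, s = math.cos(rate * theta), math.sin(rate * theta)
    u = [[c, s], [-s, c]]
    e = canonical_vectors(theta)
    return [tuple(sum(u[j][k] * e[k][i] for k in range(2)) for i in range(2))
            for j in range(2)]

def bound(vectors_at, theta, h=1e-5):
    """C = 4 sum_j ||d/dtheta (K_j|psi_0>)||^2 via central differences."""
    plus, minus = vectors_at(theta + h), vectors_at(theta - h)
    total = 0.0
    for vp, vm in zip(plus, minus):
        total += sum(((x - y) / (2 * h)) ** 2 for x, y in zip(vp, vm))
    return 4 * total

theta, rate = 0.5, 0.7
c_canonical = bound(canonical_vectors, theta)
c_mixed = bound(lambda t: mixed_vectors(t, rate), theta)
print(c_canonical, c_mixed, c_mixed - c_canonical)
```

In this example the excess of the mixed bound over the canonical one should equal $4\sum_{jk}|u_{jk}'|^2 p_k = 4\,\mathtt{rate}^2$, in line with the second term of Eq. (27).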

For this quasi-classical model, we can truly say that $C_\Upsilon(\theta) = F^*(\theta)$ (where $\{\Upsilon_k\}$
form the canonical decomposition) and hence we have a measure of statistical distance
along the curve formed by the one-parameter family of output states. In fact, we can
express this metric on the curve explicitly in terms of the local co-ordinate system set
up by the Kraus decomposition:

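$$ F^*(\theta) = 4\sum_k \langle\partial_\theta e_k(\theta)|\partial_\theta e_k(\theta)\rangle + 4\sum_k p_k(\theta)\,\langle f_k(\theta)|\partial_\theta f_k(\theta)\rangle^2 \tag{30} $$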

This result has operational significance. It means that if the input state and 
quantum channel are such that the output family is a mutually commuting one, then 
the optimal estimation scheme can be identified, and the statistical distinguishability 
be calculated easily. 

4. Examples 

In this section we will consider several examples to illustrate the results of the previous 
sections. 

4.1. Qubit depolarization channel

The qubit depolarization channel is defined as 

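$$ \rho(p) = \mathcal{D}(\rho) = \frac{4p}{3}\,\frac{I}{2} + \left(1 - \frac{4p}{3}\right)\rho = (1-p)\,\rho + \frac{p}{3}\,(X\rho X + Y\rho Y + Z\rho Z) \tag{31} $$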

where $\rho$ is a density matrix in a Hilbert space of dimension two, and $X, Y, Z$ are the Pauli
matrices. This channel can be best understood by examining its action on the Bloch
sphere representation of a qubit: it has the effect of uniformly shrinking the Bloch
sphere towards the center. The parameter to estimate is the rate of this shrinking,
parametrized by $0 \le p \le 1$.

From the definition of the channel, it is clear that $\{\rho(p)\}_p$ forms a commuting family.
Thus we can use the canonical decomposition of the channel to analyze its estimation.
The unitary invariance of this channel (spherical symmetry in the Bloch sphere
picture) implies that all pure state inputs will perform identically when it comes to
estimation performance. Therefore, choose $|\psi_0\rangle = |0\rangle$, the +1 eigenstate of $Z$. The
canonical decomposition of the channel with respect to this initial state is:




(32) 

where $q = 1 - p$. The optimality conditions of Eq. (21) are easily seen to be satisfied
by projective measurements onto the Z basis, i.e. the POVM $\{|0\rangle\langle 0|, |1\rangle\langle 1|\}$. The
statistical distance for this channel with input state $|\psi_0\rangle = |0\rangle$ is given by
$F^*(p) = \frac{2}{p(3-2p)}$, which can be achieved uniformly by the estimation strategy of measuring each
output in the Z basis. Note that this bound diverges at $p = 0$, but not at $p = 1$. This
is because at $p = 1$ we still cannot distinguish perfectly between the action of the three
Paulis with one qubit.





Figure 2. Estimating quantum channels using entanglement. If $\mathcal{E}_\theta$ is a quantum
operation acting on operators in a Hilbert space of dimension $d$, then $\dim(|\psi\rangle)$ is at
least $d^2$.



It is a well known fact that using entanglement can improve estimation [23 ^]. To
compare the performance of a scheme that uses entanglement to one that does not, we
can compare the statistical distinguishability for the two cases. Consider the estimation
of the depolarizing channel using a maximally entangled state as the input into the
channel $I \otimes \mathcal{D}$; see Fig. 2. This is a common setup for estimating channels because
it can be shown that the output state completely characterizes the channel [23]. The
canonical decomposition of the channel $I \otimes \mathcal{D}$ with respect to the maximally entangled
input state $|\psi_0\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$ consists of the Kraus operators:

$$ \begin{aligned}
\Upsilon_1(p) &= \sqrt{1-p}\; I \otimes I \\
\Upsilon_2(p) &= \sqrt{p/3}\; I \otimes X \\
\Upsilon_3(p) &= \sqrt{p/3}\; I \otimes Y \\
\Upsilon_4(p) &= \sqrt{p/3}\; I \otimes Z.
\end{aligned} \tag{33} $$

Given this, it is easy to show that the optimality conditions of Eq. (21) can be satisfied
by a POVM formed by projectors onto the Bell basis §. Note that if the singlet state,
$|\phi^-\rangle$, is used as the input state, then the measurement scheme need only discriminate
between the singlet and triplet subspaces to be optimal. The statistical distance in this
case is $F^*(p) = \frac{1}{p(1-p)}$. Figure 3 plots the two values of statistical distance, and clearly
shows the improved distinguishability of the parameter (uniformly across $p$) that the
entanglement assisted scheme affords.
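The comparison between the two schemes can be reproduced from the outcome distributions alone. The sketch below is a hedged illustration, not the authors' code: it assumes the parametrisation of Eq. (33), in which the identity is applied with probability $1-p$ and each Pauli with probability $p/3$, and the function names are made up for the example. For input $|0\rangle$ measured in the Z basis only X and Y errors flip the outcome, while for a maximally entangled input every Kraus operator leads to a distinct Bell-basis outcome.

```python
# Illustrative sketch assuming the p/3 parametrisation of Eq. (33);
# function names are not from the paper.

def fisher_from_distribution(probs, dprobs):
    """Classical Fisher information sum_k (dp_k)^2 / p_k over outcomes with p_k > 0."""
    return sum(dp * dp / p for p, dp in zip(probs, dprobs) if p > 0)

def fisher_single_qubit(p):
    """Input |0>, Z-basis measurement: outcome 1 occurs only after an X or Y error."""
    return fisher_from_distribution([1 - 2 * p / 3, 2 * p / 3], [-2 / 3, 2 / 3])

def fisher_entangled(p):
    """Maximally entangled input, Bell measurement: one outcome per Kraus operator."""
    return fisher_from_distribution([1 - p, p / 3, p / 3, p / 3],
                                    [-1, 1 / 3, 1 / 3, 1 / 3])

for p in (0.1, 0.5, 0.9):
    print(p, fisher_single_qubit(p), fisher_entangled(p))
```

Under these assumptions the two functions evaluate to $2/(p(3-2p))$ and $1/(p(1-p))$ respectively, and the entangled value is larger for every $0 < p < 1$.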




Figure 3. Statistical distinguishability of the depolarizing channel with and without
the use of entanglement during estimation. The values of $F^*$ without entanglement at
$p = 0$ and of $F^*$ with entanglement at $p = 0, 1$ are not plotted because the quantities
diverge at those points.



This example illustrates the effect of using entanglement for estimation - it can 
have the advantage of increasing the statistical distinguishability of channels. It also 
illustrates the ability of this formalism to treat entangled input states and non-local 
measurements. Such scenarios simply change the definition of the channel to a suitable 
tensor product of single party channels, while the statistical distance and optimality 
conditions retain their form. 

§ The Bell states are four maximally entangled states of two qubits: $|\psi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$,
$|\psi^-\rangle = \frac{1}{\sqrt{2}}(|00\rangle - |11\rangle)$, $|\phi^+\rangle = \frac{1}{\sqrt{2}}(|01\rangle + |10\rangle)$, $|\phi^-\rangle = \frac{1}{\sqrt{2}}(|01\rangle - |10\rangle)$.
These four states span the Hilbert space
of two qubits and are therefore called the Bell basis. The symmetric subspace of two qubit Hilbert
space is spanned by the triplet states $|\psi^\pm\rangle$ and $|\phi^+\rangle$, and the anti-symmetric subspace is spanned by
the singlet state $|\phi^-\rangle$.

4.2. Estimation of pure T2 qubit dephasing time

There has been considerable interest recently in accurately estimating single qubit 
T2 relaxation times for various quantum computing architectures fl^ ESI I2Z|- Such 
estimations have also been commonplace in the NMR community for several decades 
now. We can use the formalism developed above to investigate the schemes used for 
estimating T2. 

If we restrict the dynamics to be purely dephasing, we can model the single qubit 
channel as 

$$ \frac{d\rho}{dt} = \gamma\,(Z\rho Z - \rho) \tag{34} $$

where $\gamma$ is the dephasing rate and the parameter we are trying to estimate. A Kraus
decomposition for this process is

$$ \rho(\theta) = \mathcal{D}_Z(\rho) = \frac{1 + e^{-2\theta}}{2}\,\rho + \frac{1 - e^{-2\theta}}{2}\,Z\rho Z \tag{35} $$

where $\theta = \gamma t$ is a simple transformation of the parameter we want to estimate. In
the following we will denote the single qubit equal superposition states by
$|+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ and $|-\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$. Note that these are eigenstates of the operator
$X$ and thus will be collectively referred to as the X-basis.

The standard techniques for estimating T2 times are based on the spin echo J2H] 
pulse sequence which is essentially the preparation of a $|+\rangle$ or $|-\rangle$ initial state and
then a measurement in the X-basis after the channel has acted. This pulse sequence
actually has added features designed to nullify bulk sample inhomogeneities, but the
basic idea is as mentioned. It is easy to check that given this input state, Eq. (35) is the
canonical decomposition for this channel, and also that the output family $\{\rho(\theta)\}_\theta$ is a
mutually commuting one. Thus we can use the operators in the canonical decomposition
to determine the optimality of this scheme by checking the optimality conditions (21),
which turn out to be

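$$ \begin{aligned}
\left(-e^{-2\theta} - \lambda_j(\theta)\,(1 + e^{-2\theta})\right) E_j^{1/2}|+\rangle &= 0 \\
\left(e^{-2\theta} - \lambda_j(\theta)\,(1 - e^{-2\theta})\right) E_j^{1/2}|-\rangle &= 0
\end{aligned} \tag{36} $$

for all $j$, where we are using a discrete number of POVM elements indexed by $j$. These
two conditions can be met by a measurement in the X-basis; that is, with a POVM
$\{E_+ = |+\rangle\langle +|, E_- = |-\rangle\langle -|\}$. The choices required for $\lambda$ are: $\lambda_+ = -1/(e^{2\theta} + 1)$ and
$\lambda_- = 1/(e^{2\theta} - 1)$. So we see that the standard spin echo technique of estimating single
qubit pure dephasing time, T2, is indeed an optimal one for an X-basis input state.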
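As a quick consistency check of the quoted values of $\lambda_\pm$, the sketch below (illustrative only; the helper names are assumptions, and the outcome probabilities $(1 \pm e^{-2\theta})/2$ are those implied by Eq. (35) for a $|+\rangle$ input) compares $2\lambda_\pm$ with a finite-difference estimate of $\partial_\theta \ln p_\pm$, and evaluates the Fisher information of the X-basis measurement as $4\sum_j p_j\lambda_j^2$.

```python
import math

# Illustrative sketch; helper names are not from the paper.

def x_basis_probabilities(theta):
    """Outcome probabilities (p_plus, p_minus) for input |+> after pure dephasing."""
    decay = math.exp(-2 * theta)
    return (1 + decay) / 2, (1 - decay) / 2

def lambdas(theta):
    """lambda_+ and lambda_- as quoted in the text."""
    return -1 / (math.exp(2 * theta) + 1), 1 / (math.exp(2 * theta) - 1)

def fisher_information(theta):
    """Fisher information of the X-basis measurement, 4 * sum_j p_j * lambda_j^2."""
    pairs = zip(x_basis_probabilities(theta), lambdas(theta))
    return 4 * sum(p * lam ** 2 for p, lam in pairs)

def log_derivative(theta, index, h=1e-6):
    """Central-difference estimate of d/dtheta ln p_index(theta)."""
    up = x_basis_probabilities(theta + h)[index]
    down = x_basis_probabilities(theta - h)[index]
    return (math.log(up) - math.log(down)) / (2 * h)

theta = 0.3
for index, lam in enumerate(lambdas(theta)):
    print(2 * lam, log_derivative(theta, index))
print(fisher_information(theta), 4 / (math.exp(4 * theta) - 1))
```

The closed form $4/(e^{4\theta} - 1)$ printed on the last line is derived here from the same probabilities and is not quoted in the text above.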

Again, it is possible to show that using entanglement helps in the estimation of the
parameter $\theta$ for this channel. However, a channel extension of the form $I \otimes \mathcal{D}_Z$ does not
increase the statistical distance; instead a channel extension of the form $\mathcal{D}_Z \otimes \mathcal{D}_Z$ must
be used with a maximally entangled state.

4.3. The random shift channel

The random shift channel is defined by the master equation

$$ \frac{d\rho}{dt} = \gamma\,(U_\beta\,\rho\,U_\beta^\dagger - \rho) \tag{37} $$

where $\gamma$ is a real, positive constant, and $U_\beta$ is a unitary operator: $U_\beta = e^{-i\beta H}$ for
some continuous spectrum Hermitian operator $H$, and some real number $\beta$. A Kraus
decomposition for this channel is

$$ A_k(\theta) = \frac{\theta^{k/2}}{\sqrt{k!}}\,e^{-\theta/2}\,U_\beta^k, \qquad k = 0, 1, 2, \ldots \tag{38} $$

where $\theta = \gamma t$. These equations describe a channel that delivers a Poisson distributed
number of unitary displacements (or unitary 'kicks') by $\beta$ to the input state. The
average number of kicks in a time $t$ is given by $\theta = \gamma t$, and is the parameter we are
estimating.

The way to optimally estimate the unitary version of this channel, $\rho \to U_\beta\,\rho\,U_\beta^\dagger$, is to
use the fact that $H$ is a generator of translations in some basis [71 E]. That is, if $|x\rangle$
is an eigenstate of an operator conjugate to $H$, then $U_\beta|x\rangle = e^{-i\beta H}|x\rangle = |x + \beta\rangle$. Then
the optimal scheme is to input a fiducial state $|x_0\rangle$ and to use a POVM that is formed
from projectors onto translated versions of this state $\{|x\rangle : |x\rangle = U_\beta|x_0\rangle, \beta \in \mathbb{R}\}$
($\langle x|x'\rangle = \delta(x - x')$). For example, if $H = p$, the momentum operator (and hence $U_\beta$ is
a spatial translation), then we would choose $E$ and $\rho_0$ to be projectors onto position
eigenstates.

Since $U_\beta$ is simply a representation of an abelian group and $U_\beta U_{\beta'} = U_{\beta + \beta'}$, we would
expect the optimal scheme for estimating the random shift channel to be the same as
that for estimating the unitary shift channel. Note that when the input state is a fiducial
state $|x_0\rangle$, which is translated by $U_\beta$, the Kraus operators given by Eq. (38) form the
canonical decomposition. Additionally, the model is quasi-classical because the output
family is a commuting one. Therefore, we can check the optimality of using the unitary
channel estimation scheme for estimating the random shift channel by examining the
optimality conditions with the canonical Kraus operators. Here $A_k'(\theta) = \left(-\frac{1}{2} + \frac{k}{2\theta}\right) A_k(\theta)$,
and when the input state is $|x_0\rangle$, the optimality conditions become:

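$$ \left(-\frac{1}{2} + \frac{k}{2\theta} - \lambda_\xi(\theta)\right) E(\xi)^{1/2} A_k(\theta)|x_0\rangle = 0 \quad \forall\, \xi, k \tag{39} $$

Now, choosing $E(\xi) = E(x) = |x\rangle\langle x|$, we get:

$$ \left(-\frac{1}{2} + \frac{k}{2\theta} - \lambda_x\right) \delta(x - (x_0 + k\beta))\,|x\rangle = 0 \quad \forall\, x, k \tag{40} $$

The left hand side is zero except when $x = x_0 + k\beta$, and in that case we can choose
$\lambda_x = \frac{k}{2\theta} - \frac{1}{2}$ (with $k = (x - x_0)/\beta$) so that the left hand side goes to zero in all cases. Therefore as in the
unitary case, using a POVM formed of projectors onto shifted states is optimal when
the input is an element of this same set. The statistical distance for this estimation
scheme is $F^*(\theta) = 1/\theta$.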
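The value of the statistical distance can be checked directly from the outcome statistics. The sketch below is illustrative and its function names are assumptions: it uses the fact that, for this scheme, the outcome $x_0 + k\beta$ occurs with the Poisson probability $e^{-\theta}\theta^k/k!$ implied by Eq. (38), and sums $p_k\,(\partial_\theta \ln p_k)^2$ over a truncated range of $k$.

```python
import math

# Illustrative sketch; names and the truncation k_max are assumptions.

def poisson_probability(k, theta):
    """Probability of k kicks, e^{-theta} theta^k / k!, computed in log space."""
    return math.exp(-theta + k * math.log(theta) - math.lgamma(k + 1))

def fisher_information(theta, k_max=200):
    """Sum of p_k * (d ln p_k / d theta)^2, using d ln p_k / d theta = k/theta - 1."""
    return sum(poisson_probability(k, theta) * (k / theta - 1) ** 2
               for k in range(k_max + 1))

for theta in (0.5, 2.0, 10.0):
    print(theta, fisher_information(theta), 1 / theta)
```

The truncation at `k_max = 200` is an arbitrary choice that is ample for the values of $\theta$ shown; larger $\theta$ would need a larger cutoff.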




4.4. The damping channel

As our final example of a one parameter quantum process, we consider the harmonic
oscillator damping channel (DC). This is a quantum process described by the master
equation

$$ \frac{d\rho}{dt} = \gamma\left(a\rho a^\dagger - \frac{1}{2}\,(a^\dagger a\rho + \rho a^\dagger a)\right) \tag{41} $$

where $a^\dagger$ and $a$ are creation and annihilation operators for a harmonic oscillator mode,
and $\gamma$ is a real, positive constant. This channel describes the effects of random photon
loss. An operator sum decomposition for this process can be obtained by expanding the
above master equation as a Dyson series and solving. This yields the following Kraus
operators:

$$ A_k(\theta) = \sqrt{\frac{(1 - e^{-\theta})^k}{k!}}\; e^{-\frac{\theta}{2} a^\dagger a}\, a^k, \qquad k = 0, 1, 2, \ldots \tag{42} $$

where $\theta = \gamma t$ is the parameter to be estimated for this channel. Note that the state space
of $\rho(\theta)$ is infinite dimensional and there are also an infinite number of Kraus operators.
One interpretation of this quantum operation is that it describes the transformation
of a state when combined with the vacuum at a beam splitter (see Fig. 4). That is, the
state of mode $a$ after the beam splitter is given by

$$ \rho_a' = \mathrm{tr}_b\left[U(\phi)\,(\rho_a \otimes |0\rangle_b\langle 0|)\,U^\dagger(\phi)\right] \tag{43} $$

where $U(\phi) = \exp(-i\phi\,(a^\dagger b + a b^\dagger))$ is the beam splitter unitary transformation with $a$
and $b$ being the annihilation operators for modes $a$ and $b$ respectively. Evaluating this
trace gives the same CP map as the damping channel with $e^{-\theta}$ replaced by $\cos^2\phi$, the
intensity transmittance of the beam splitter. Therefore our estimation task is equivalent
to the estimation of the transmittance of a beam splitter.



Figure 4. The beam splitter interpretation of the damping channel.

A common method for probing such a channel would be with a Fock (photon
number) state $|\psi_0\rangle = |N\rangle$, where $N = 0, 1, 2, \ldots$ is the number of photons in the
mode. Similarly, common measurement techniques at the channel output would be
photodetection, heterodyne, or homodyne measurements.

To determine the optimal measurement strategy when photon number states are
used as input, we again confirm that firstly, the Kraus decomposition defined by Eq. (42)
is indeed the canonical one when number states are used, and secondly, that the channel
is quasi-classical with such input states. Therefore we can decide on the optimal POVM
by looking at the optimality conditions of Eq. (21). Here $A_k'(\theta) = \left(\frac{k e^{-\theta}}{2(1 - e^{-\theta})} - \frac{1}{2} a^\dagger a\right) A_k(\theta)$,
and hence the conditions become:

$$ \frac{k e^{-\theta}}{2(1 - e^{-\theta})}\, E(\xi)^{1/2} A_k(\theta)|N\rangle - \frac{1}{2}\, E(\xi)^{1/2}\, a^\dagger a\, A_k(\theta)|N\rangle = \lambda_\xi(\theta)\, E(\xi)^{1/2} A_k(\theta)|N\rangle \tag{44} $$

for all $\xi$ and $k$. We can simplify this by applying all operators except $E(\xi)$ to the input
state. Note that for $k > N$ the application of $A_k(\theta)$ yields zero and the condition is
trivially satisfied. For $k \le N$ we have the condition that for all $\xi$:

$$ \left(\frac{k e^{-\theta}}{2(1 - e^{-\theta})} - \frac{N - k}{2} - \lambda_\xi(\theta)\right) E(\xi)^{1/2}|N - k\rangle = 0 \tag{45} $$

To satisfy this we can choose $E(\xi) \to E(M) = |M\rangle\langle M|$, a number state projector,
which corresponds to photodetection. This choice requires $\lambda_M = \frac{N - M e^{\theta}}{2(e^{\theta} - 1)}$. Therefore
the best strategy for estimating the damping channel when Fock states are used as input
is to perform photodetection at the output. The statistical distance for this scenario is
$F_N^*(\theta) = \frac{N}{e^{\theta} - 1}$, and hence larger $N$ at the input makes the process more distinguishable.
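The photodetection statistics make this result easy to verify numerically. The sketch below is an illustration with assumed helper names: it treats each of the $N$ input photons as surviving independently with probability $e^{-\theta}$, so that the detected number $M$ is binomially distributed, takes $\lambda_M$ from Eq. (45) with $k = N - M$ lost photons, and evaluates the Fisher information as $4\sum_M p_M\lambda_M^2$.

```python
import math

# Illustrative sketch; helper names are not from the paper.

def photon_count_probability(m, n, theta):
    """Probability that m of n input photons survive, with transmittance e^{-theta}."""
    eta = math.exp(-theta)
    return math.comb(n, m) * eta ** m * (1 - eta) ** (n - m)

def lambda_m(m, n, theta):
    """lambda_M solving Eq. (45) with k = n - m lost photons."""
    eta = math.exp(-theta)
    return (n - m) * eta / (2 * (1 - eta)) - m / 2

def fisher_information(n, theta):
    """Fisher information of photodetection, 4 * sum_M p_M * lambda_M^2."""
    return 4 * sum(photon_count_probability(m, n, theta) * lambda_m(m, n, theta) ** 2
                   for m in range(n + 1))

theta = 0.8
for n in (1, 5, 20):
    print(n, fisher_information(n, theta), n / (math.exp(theta) - 1))
```

The linear growth with `n` reflects the independence of the photons in a Fock-state input: each photon contributes the same amount of information about $\theta$.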

4.5. A comment on estimation

In all the examples considered above, except the random shift channel, we are trying 
to estimate a continuous parameter with experiments that have discrete outcomes. 
This may seem peculiar, but the situation is clarified by the observation that the 
parameter is always a continuous function of the probabilities of the discrete outcomes 
(or rather, the probabilities are functions of the parameter). Therefore from the point 
of view of the estimator, the problem is the same as estimating the parameter of a 
probability distribution from independent experiments that sample that distribution. 
This is a well known estimation problem in classical estimation theory and a maximum 
likelihood estimator [22] would be a practical estimator that would also achieve the
Fisher information bound asymptotically. 
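A small simulation makes this point concrete. The sketch below is a hypothetical illustration rather than a procedure from the paper: it reuses the pure dephasing example, assuming the X-basis outcome $-$ occurs with probability $(1 - e^{-2\theta})/2$, draws independent outcomes with a seeded pseudo-random generator, and inverts the observed frequency to obtain the maximum likelihood estimate. The shot count, seed and function names are arbitrary choices.

```python
import math
import random

# Illustrative simulation; all names, the seed and the shot count are assumptions.

def sample_minus_outcomes(theta, shots, rng):
    """Count X-basis '-' outcomes for input |+> after pure dephasing."""
    p_minus = (1 - math.exp(-2 * theta)) / 2
    return sum(1 for _ in range(shots) if rng.random() < p_minus)

def maximum_likelihood_estimate(minus_count, shots):
    """Invert p_minus = (1 - e^{-2 theta}) / 2 at the observed frequency."""
    frequency = minus_count / shots
    if frequency >= 0.5:
        return math.inf  # observed frequency lies outside the range of the model
    return -0.5 * math.log(1 - 2 * frequency)

rng = random.Random(1234)
true_theta, shots = 0.3, 200_000
estimate = maximum_likelihood_estimate(
    sample_minus_outcomes(true_theta, shots, rng), shots)
# Standard deviation predicted by the Fisher information 4 / (e^{4 theta} - 1).
cramer_rao_std = math.sqrt((math.exp(4 * true_theta) - 1) / (4 * shots))
print(estimate, cramer_rao_std)
```

With many shots the estimate should fall within a few multiples of `cramer_rao_std` of the true value, which is the asymptotic behaviour described above.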

5. Conclusion 

We have investigated the problem of optimally estimating a general one parameter 
quantum process. We have attempted to obtain characterizations of estimation 
optimality in terms of a common representation of quantum processes, the Kraus 
decomposition. We derived a bound on estimation accuracy and conditions of optimality
when the input state is fixed; however, the non-uniqueness of the Kraus decomposition
causes this bound to be non-unique. It also makes proving the achievability of the bound
difficult. However, we have shown that in the special case of a quasi-classical channel,
the issues of uniqueness and attainability can be settled, and in this special case, the
characterization of optimal estimation (with a fixed input state) we derived is useful in
determining the statistical distinguishability of quantum processes.

Representing a quantum process in terms of its Kraus decomposition has the
advantage that it is often easy to do; however, in view of the above treatment, this
representation is difficult to use for characterizing optimal estimation strategies because of its
non-uniqueness. The immediate direction in which this work could be extended is to
investigate the possibility of settling the issues of attainability and uniqueness for a
general quantum channel. In particular, explicit expressions for the optimal POVM
from the optimality conditions, Eq. (21), for a fixed set of Kraus operators, would be
extremely useful.